AITopics | decision sequence

Stochastic Online Greedy Learning with Semi-bandit Feedbacks

Tian Lin, Jian Li, Wei Chen

Neural Information Processing Systems | Feb-18-2026, 21:25:18 GMT

In this paper, we address the online learning problem when the input to the greedy algorithm is stochastic with unknown parameters that have to be learned over time.

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.05)
Africa > Middle East > Algeria > Sétif Province > Sétif (0.04)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Stochastic Online Greedy Learning with Semi-bandit Feedbacks

Tian Lin, Jian Li, Wei Chen

Neural Information Processing Systems | Oct-1-2025, 23:27:56 GMT

The greedy algorithm has been extensively studied in the field of combinatorial optimization for decades. In this paper, we address the online learning problem when the input to the greedy algorithm is stochastic with unknown parameters that have to be learned over time. We first propose the greedy regret and $\epsilon$-quasi greedy regret as learning metrics that compare against the performance of the offline greedy algorithm. We then propose two online greedy learning algorithms with semi-bandit feedbacks, which use multi-armed bandit and pure exploration bandit policies at each level of greedy learning, one for each of the regret metrics respectively. Both algorithms achieve an $O(\log T)$ problem-dependent regret bound ($T$ being the time horizon) for a general class of combinatorial structures and reward functions that allow greedy solutions. We further show that the bound is tight in $T$ and other problem instance parameters.
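As a rough illustration of what a multi-armed bandit policy at one level of greedy selection could look like, the sketch below uses the standard UCB1 index. This is an assumption made for illustration, not the algorithm from the paper; the function names `ucb1_index` and `pick_arm` are hypothetical.

```python
import math


def ucb1_index(mean, pulls, t):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # arms that were never played are tried first
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def pick_arm(means, pulls, t):
    """Return the index of the arm with the largest UCB1 index at round t.

    means[a] is the empirical mean reward of arm a, pulls[a] is how often
    it has been played so far. Ties go to the lowest arm index.
    """
    return max(range(len(means)), key=lambda a: ucb1_index(means[a], pulls[a], t))
```

In an online greedy setting, one such choice would be made per level of the greedy sequence, with the observed per-level rewards (the semi-bandit feedback) used to update `means` and `pulls`.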

algorithm, greedy regret, sequence, (16 more...)

Neural Information Processing Systems

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Instructional Material > Online (0.60)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Monitoring Robustness and Individual Fairness

Gupta, Ashutosh, Henzinger, Thomas A., Kueffner, Konstantin, Mallik, Kaushik, Pape, David

arXiv.org Artificial Intelligence | Jun-3-2025

Input-output robustness appears in various forms in the literature, such as robustness of AI models to adversarial or semantic perturbations and individual fairness of AI models that make decisions about humans. We propose runtime monitoring of input-output robustness of deployed, black-box AI models, where the goal is to design monitors that observe one long execution sequence of the model and raise an alarm whenever two similar inputs from the past are detected to have led to dissimilar outputs. This way, monitoring will complement existing offline "robustification" approaches to increase the trustworthiness of AI decision-makers. We show that the monitoring problem can be cast as the fixed-radius nearest neighbor (FRNN) search problem, which, despite being well-studied, lacks suitable online solutions. We present our tool Clemont, which offers a number of lightweight monitors, some of which use upgraded online variants of existing FRNN algorithms, and one of which uses a novel algorithm based on binary decision diagrams, a data structure commonly used in software and hardware verification. We have also developed an efficient parallelization technique that can substantially cut down the computation time of monitors for which the distance between input-output pairs is measured using the $L_\infty$ norm. Using standard benchmarks from the literature of adversarial and semantic robustness and individual fairness, we perform a comparative study of different monitors in Clemont, and demonstrate their effectiveness in correctly detecting robustness violations at runtime.
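To make the monitoring condition concrete, here is a minimal brute-force sketch. It is not one of Clemont's monitors. The class name `BruteForceMonitor` and the separate thresholds `eps_in` and `eps_out` are assumptions chosen for illustration; distances use the $L_\infty$ norm.

```python
def linf(a, b):
    """L-infinity distance between two equal-length numeric vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


class BruteForceMonitor:
    """Flags past observations whose input is close but whose output is far."""

    def __init__(self, eps_in, eps_out):
        self.eps_in = eps_in    # inputs within this distance count as similar
        self.eps_out = eps_out  # outputs beyond this distance count as dissimilar
        self.history = []       # all (input, output) pairs seen so far

    def observe(self, x, y):
        """Record (x, y) and return the indices of past pairs that conflict with it."""
        violations = [
            i for i, (px, py) in enumerate(self.history)
            if linf(x, px) <= self.eps_in and linf(y, py) > self.eps_out
        ]
        self.history.append((tuple(x), tuple(y)))
        return violations
```

Each call scans the whole history, so the cost per observation grows linearly with the length of the execution; this is the cost that an online FRNN data structure would aim to reduce.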

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3711896.3737054

arXiv: 2506.00496

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Ontario > Toronto (0.05)
(9 more...)

Genre: Research Report (0.82)

Industry: Information Technology (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Proportional aggregation of preferences for sequential decision making

AIHub | Aug-27-2024, 08:02:42 GMT

In various decision-making settings, from recommendation systems to hiring processes, a sequence of decisions is often made by a group. A naive approach to decision-making in such scenarios is to select the alternative with the most supporters in each round. However, this method can lead to unrepresentative outcomes, where a majority dictates all decisions, potentially disincentivizing participation from minority groups. Consider the following example where a group of friends (voters) want to hang out together weekly. They have diverse choices for the activities (alternatives) they approve of every week (round), but only one activity can be chosen as the decision (i.e., the activity which the whole group ends up pursuing even if some don't like it).
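The naive rule and its failure mode can be sketched in a few lines. The ballots below (three friends who only approve "movie", two who only approve "hike", repeated over four weeks) are made-up illustration data, and `most_approved` is a hypothetical helper name.

```python
from collections import Counter


def most_approved(ballots):
    """Pick the alternative approved by the most voters in one round.

    ballots is a list of sets, one set of approved alternatives per voter.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(alt for ballot in ballots for alt in ballot)
    return min(counts, key=lambda alt: (-counts[alt], alt))


# Hypothetical data: the same 3-versus-2 split in each of four weekly rounds.
one_round = [{"movie"}, {"movie"}, {"movie"}, {"hike"}, {"hike"}]
rounds = [one_round] * 4

decisions = [most_approved(r) for r in rounds]
# Number of rounds in which the two-person minority got an approved activity.
minority_wins = sum(1 for d in decisions if d == "hike")
```

With these ballots the majority's choice is selected in every round, so the two friends who make up 40% of the group never get an activity they approve of.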

algorithm, decision sequence, sequence, (15 more...)

AIHub

Country: North America > United States > California (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.42)

FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation

Wang, Tianfu, Fan, Qilin, Wang, Chao, Yang, Long, Ding, Leilei, Yuan, Nicholas Jing, Xiong, Hui

arXiv.org Artificial Intelligence | May-1-2024

Virtual network embedding (VNE) is an essential resource allocation task in network virtualization, aiming to map virtual network requests (VNRs) onto physical infrastructure. Reinforcement learning (RL) has recently emerged as a promising solution to this problem. However, existing RL-based VNE methods are limited by their unidirectional action design and one-size-fits-all training strategy, resulting in restricted searchability and generalizability. In this paper, we propose a FLexible And Generalizable RL framework for VNE, named FlagVNE. Specifically, we design a bidirectional action-based Markov decision process model that enables the joint selection of virtual and physical nodes, thus improving the exploration flexibility of the solution space. To tackle the expansive and dynamic action space, we design a hierarchical decoder to generate adaptive action probability distributions and ensure high training efficiency. Furthermore, to overcome the generalization issue for varying VNR sizes, we propose a meta-RL-based training method with a curriculum scheduling strategy, facilitating specialized policy training for each VNR size. Finally, extensive experimental results show the effectiveness of FlagVNE across multiple key metrics. Our code is available at GitHub (https://github.com/GeminiLight/flag-vne).

flagvne, node, vnr, (16 more...)

arXiv.org Artificial Intelligence

arXiv: 2404.12633

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

An Efficient Game-Theoretic Planner for Automated Lane Merging with Multi-Modal Behavior Understanding

Zhang, Luyao, Han, Shaohang, Grammatico, Sergio

arXiv.org Artificial Intelligence | Dec-2-2023

In this paper, we propose a novel behavior planner that combines game theory with search-based planning for automated lane merging. Specifically, inspired by human drivers, we model the interaction between vehicles as a gap selection process. To overcome the challenge of multi-modal behavior exhibited by the surrounding vehicles, we formulate the trajectory selection as a matrix game and compute an equilibrium. Next, we validate our proposed planner in the high-fidelity simulator CARLA and demonstrate its effectiveness in handling interactions in dense traffic scenarios.

ego vehicle, equilibrium, vehicle, (13 more...)

arXiv.org Artificial Intelligence

arXiv: 2311.14916

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.82)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Large Language Model as Autonomous Decision Maker

Ye, Yining, Cong, Xin, Qin, Yujia, Lin, Yankai, Liu, Zhiyuan, Sun, Maosong

arXiv.org Artificial Intelligence | Aug-23-2023

While large language models (LLMs) exhibit impressive language understanding and in-context learning abilities, their decision-making ability still relies heavily on the guidance of task-specific expert knowledge when solving real-world tasks. To unleash the potential of LLMs as autonomous decision makers, this paper presents an approach, JuDec, that endows LLMs with self-judgment ability, enabling them to achieve autonomous judgment and exploration for decision making. Specifically, in JuDec, an Elo-based Self-Judgment Mechanism is designed to assign Elo scores to decision steps, judging their values and utilities via pairwise comparisons between two solutions, and then to guide the decision-searching process toward the optimal solution accordingly. Experimental results on the ToolBench dataset demonstrate JuDec's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks. It offers higher-quality solutions and reduces costs (ChatGPT API calls), highlighting its effectiveness and efficiency.
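For readers unfamiliar with Elo scoring, the sketch below shows the conventional Elo update after one pairwise comparison. It is the textbook rating formula, not JuDec's actual mechanism; the scale constant 400, the step size `k=32`, and the function names are conventional or hypothetical choices assumed here.

```python
def expected_score(r_a, r_b):
    """Probability that A beats B under the conventional Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated ratings (new_a, new_b) after one comparison.

    score_a is 1.0 if A was judged better, 0.0 if B was, 0.5 for a tie.
    """
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b
```

For example, `elo_update(1000, 1000, 1.0)` moves the winner up and the loser down by the same amount, so the sum of the two ratings is unchanged.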

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

arXiv: 2308.12519

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Proportional Aggregation of Preferences for Sequential Decision Making

Chandak, Nikhil, Goel, Shashwat, Peters, Dominik

arXiv.org Artificial Intelligence | Jun-26-2023

We study the problem of fair sequential decision making given voter preferences. In each round, a decision rule must choose a decision from a set of alternatives where each voter reports which of these alternatives they approve. Instead of going with the most popular choice in each round, we aim for proportional representation. We formalize this aim using axioms based on Proportional Justified Representation (PJR), which were proposed in the literature on multi-winner voting and were recently adapted to multi-issue decision making. The axioms require that every group of $\alpha\%$ of the voters, if it agrees in every round (i.e., approves a common alternative), then those voters must approve at least $\alpha\%$ of the decisions. A stronger version of the axioms requires that every group of $\alpha\%$ of the voters that agrees in a $\beta$ fraction of rounds must approve $\beta\cdot\alpha\%$ of the decisions. We show that three attractive voting rules satisfy axioms of this style. One of them (Sequential Phragmén) makes its decisions online, and the other two satisfy strengthened versions of the axioms but make decisions semi-online (Method of Equal Shares) or fully offline (Proportional Approval Voting). The first two are polynomial-time computable, and the latter is based on an NP-hard optimization, but it admits a polynomial-time local search algorithm that satisfies the same axiomatic properties. We present empirical results about the performance of these rules based on synthetic data and U.S. political elections. We also run experiments where votes are cast by preference models trained on user responses from the moral machine dataset about ethical dilemmas.
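The basic axiom can be checked for a single, given group of voters with a short sketch. Two readings are assumed here and are not taken from the abstract: a decision counts as approved by the group if at least one member approves it, and the required number of decisions is rounded down. The function names and the data layout (`approvals[t][v]` is the set voter `v` approves in round `t`) are hypothetical.

```python
def group_agrees_every_round(approvals, group):
    """True if the (non-empty) group shares an approved alternative in every round."""
    return all(
        set.intersection(*(round_ballots[v] for v in group))
        for round_ballots in approvals
    )


def satisfies_guarantee(approvals, decisions, group, n_voters):
    """Check the proportionality requirement for one group of voters."""
    if not group_agrees_every_round(approvals, group):
        return True  # the axiom imposes no requirement on this group
    approved = sum(
        1 for t, d in enumerate(decisions)
        if any(d in approvals[t][v] for v in group)  # assumed PJR-style reading
    )
    required = (len(group) * len(decisions)) // n_voters  # assumed rounding down
    return approved >= required
```

For instance, with five voters over four rounds, a cohesive group of two voters (40%) requires at least one approved decision under these assumptions. Checking the axiom in full would mean repeating this test over every group of voters, which this sketch does not attempt.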

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

arXiv: 2306.14858

Country:

North America > United States > Colorado (0.04)
North America > United States > New York (0.04)
North America > United States > California > Shasta County (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Government > Voting & Elections (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)

ASAC: Active Sensing using Actor-Critic models

Yoon, Jinsung, Jordon, James, van der Schaar, Mihaela

arXiv.org Machine Learning | Jun-16-2019

Deciding what and when to observe is critical when making observations is costly. In a medical setting where observations can be made sequentially, making these observations (or not) should be an active choice. We refer to this as the active sensing problem. In this paper, we propose a novel deep learning framework, which we call ASAC (Active Sensing using Actor-Critic models), to address this problem. ASAC consists of two networks: a selector network and a predictor network. The selector network uses previously selected observations to determine what should be observed in the future. The predictor network uses the observations selected by the selector network to predict a label, providing feedback to the selector network (well-selected variables should be predictive of the label). The goal of the selector network is then to select variables that balance the cost of observing the selected variables with their predictive power; we wish to preserve the conditional label distribution. During training, we use the actor-critic models to allow the loss of the selector to be "back-propagated" through the sampling process. The selector network "acts" by selecting future observations to make. The predictor network acts as a "critic" by feeding predictive errors for the selected variables back to the selector network. In our experiments, we show that ASAC significantly outperforms state-of-the-art methods on two real-world medical datasets.

asac, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

arXiv: 1906.06796

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Online Decisioning Meta-Heuristic Framework for Large Scale Black-Box Optimization

Zhao, Mingde, Ge, Hongwei, Lian, Yi, Chen, C. L. Philip

arXiv.org Artificial Intelligence | Dec-16-2018

Motivated by practical concerns and aiming for high overall efficiency of resource utilization, this paper transforms large-scale black-box optimization problems with limited resources into online decision problems from the perspective of dynamic multi-armed bandits, a simplified view of Markov decision processes. The proposed Online Decisioning Meta-heuristic framework (ODM) is particularly well suited for real-world applications, with flexible compatibility for various kinds of costs, interfaces for easy heuristic articulation, and fewer hyper-parameters for less variance in performance. Experimental results on benchmark functions suggest that ODM has significant capabilities for online decisioning. Furthermore, when ODM is articulated with three heuristics, competitive performance can be achieved on benchmark problems with search dimensions up to 10000.

evolutionary algorithm, machine learning, optimization, (19 more...)

arXiv.org Artificial Intelligence

arXiv: 1812.06585

Country:

Asia (0.28)
North America > Canada (0.28)

Genre: Research Report (1.00)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
